Abstract



The vertex-cover problem is studied for random graphs $G_{N,cN}$ having $N$ vertices
and $cN$ edges. Exact numerical results are obtained by a branch-and-bound algorithm.
It is found that a transition in the coverability appears at a $c$-dependent threshold $x = x_c(c)$,
where $xN$ is the cardinality of the vertex cover. This transition coincides with
a sharp peak of the typical numerical effort, which is needed to decide whether there 
exists a cover with xN vertices or not. Additionally, the transition is visible in a jump 
of the backbone size as a function of x. 

For small edge concentrations $c \ll 0.5$, a cluster expansion is performed, giving very
accurate results in this regime. These results are extended using methods developed in
statistical physics. The so-called annealed approximation reproduces a rigorous bound
on $x_c(c)$ which was known previously. The main part of the paper contains an application
of the replica method. Within the replica-symmetric ansatz the threshold $x_c(c)$ and the
critical backbone size $b_c(c)$ can be calculated. For $c < e/2$ the results show excellent
agreement with the numerical findings. At average vertex degree $2c = e$, an instability
of the simple replica-symmetric solution occurs.

1 Introduction 

According to Garey and Johnson [1], the vertex cover (VC) problem belongs to the six basic
NP-complete problems. Here VC is investigated for an ensemble of random graphs $G_{N,cN}$
having $N$ vertices and $cN$ edges [?], with $c$ constant. Despite some efforts in the past [?],
no solution for the critical cardinality $X_c(c)$ of the vertex cover as a function of $c$ has been
found, but some lower and upper bounds were obtained. In this paper we investigate the
problem with an exact branch-and-bound algorithm, a cluster expansion for small $c$, and
methods borrowed from the statistical physics of disordered systems [?], see also [?].

Our main result is the following, with $W$ being the Lambert W function, defined by $W(x)\,e^{W(x)} = x$:
In the large-$N$ limit and for $c < e/2$ ($e$ being Euler's number), the cardinality $X_c(c) = x_c(c)N$ of the
minimal vertex cover of a random graph $G_{N,cN}$ is given by

\[ x_c(c) = 1 - \frac{2W(2c) + W(2c)^2}{4c} , \tag{1} \]

and the fraction $b_c(c)$ of vertices being in the backbone (see below) of these minimal VCs reads

\[ b_c(c) = 1 - \frac{W(2c)^2}{2c} . \tag{2} \]

For $c > e/2$, the expression given on the right-hand side of (1) provides a lower bound on
$x_c(c)$.
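
The closed-form result can be checked numerically. The following is a minimal sketch, assuming the forms $x_c(c) = 1 - [2W(2c) + W(2c)^2]/(4c)$ and $b_c(c) = 1 - W(2c)^2/(2c)$, which reproduce the replica values $x_c(0.05) = 0.045577$ and $b_c(0.05) = 0.916686$ listed in Table 1. The function names and the Newton iteration for $W$ are illustrative choices and not part of the original work.

```python
import math


def lambert_w(x, tol=1e-14, max_iter=100):
    """Principal branch of the Lambert W function for x >= 0 (Newton's method)."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starting guess; W(x) <= log(1 + x) holds for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))  # Newton step for w * e^w - x = 0
        w -= step
        if abs(step) < tol:
            break
    return w


def critical_cover_fraction(c):
    """x_c(c), assumed form of Eq. (1); meaningful as the threshold for 0 < c < e/2."""
    w = lambert_w(2.0 * c)
    return 1.0 - (2.0 * w + w * w) / (4.0 * c)


def critical_backbone_fraction(c):
    """b_c(c), assumed form of Eq. (2)."""
    w = lambert_w(2.0 * c)
    return 1.0 - w * w / (2.0 * c)


if __name__ == "__main__":
    for c in (0.05, 0.1, 0.25, 0.5, 1.0):
        print(c, critical_cover_fraction(c), critical_backbone_fraction(c))
```

For $c > e/2$ the first function only gives the lower bound mentioned in the text.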

The backbone is defined as follows: For a given graph, usually several different minimal vertex
covers exist. A vertex which belongs either to all minimal vertex covers or to no minimal vertex
cover of the graph is said to belong to the backbone.

Statistical mechanics methods were already applied to other famous NP-complete problems,
e.g. K-satisfiability (KSAT) [17] or number partitioning [?]. They are known to
show interesting phase transitions in their solvability and, even more interestingly, in their
typical-case algorithmic complexity, i.e. in the dependence of the median solution time on the
system size [?, 10]. Consider e.g. the satisfiability problem with the number of constraints
per variable as a parameter. When this parameter exceeds a certain threshold, the solvability
of a randomly chosen logical formula undergoes a sharp transition from almost always
satisfiable to almost always unsatisfiable [16]. The hardest-to-solve formulae are found in the
vicinity of the transition point. Far away from this point the solution time is much smaller,
as the problem is either easily fulfilled or hopelessly over-constrained. The typical solution times
in the under-constrained phase are even found to depend only polynomially on the system
size! Recently, insight coming from a statistical-physics perspective on these problems [17]
has led to a fruitful cooperation with computer scientists, and has shed some light on the
nature of this transition [19]. Frequently, at the cost of not being mathematically rigorous,
methods of statistical physics allow one to obtain more insight than classical tools of computer
science or discrete mathematics. This is true for the VC problem as well, as will be shown
in this work.

The paper is organized as follows. After this introductory section, the investigated model, 
related problems, and several notations are introduced. Some previously known rigorous 
bounds for the minimum cardinality of the vertex cover are cited. In Section 3, VC
is studied numerically with an exact branch-and-bound procedure. Then a cluster expansion 
for disconnected graphs with low average vertex degree is performed. Section 5 contains 
the main part of the paper: statistical physics strategies are applied. A short introduction is 
given, which relates several elements of graph theory to corresponding quantities appearing in 
physics. Then, two approaches are presented. The annealed approximation reproduces one of 
the above-mentioned rigorous bounds. More detailed insight is gained by the replica method. 
Using the replica symmetric ansatz, the threshold and the backbone size at the threshold can 
be calculated. The results are compared with the data obtained by the branch-and-bound 
method. In the last section conclusions and an outlook are given. 

2 The model 

2.1 Vertex cover and related problems 

In this section we introduce the investigated model.

Take any graph $G = (V,E)$ with $N$ vertices $i \in \{1,\dots,N\}$ and $M$ edges $(i,j) \in
E \subset V \times V$. A vertex cover (VC) is a subset $V_{vc} \subset V$ of vertices such that for every edge
$(i,j) \in E$ at least one of its endpoints $i$ or $j$ is in $V_{vc}$. We call the vertices in $V_{vc}$
covered, whereas the vertices in its complement $V \setminus V_{vc}$ are called uncovered.

Partial covers are considered as well. In this case the set $V_{vc}$ is not a VC and there are
some edges $(i,j)$ with $i \notin V_{vc}$ and $j \notin V_{vc}$. We call such an edge uncovered as
well. The task of finding the minimum number of uncovered edges given a graph $G$ and the
cardinality $X = |V_{vc}|$ is an optimization problem.

The corresponding decision problem, whether there exists a VC $V_{vc}$ of fixed cardinality
$X = |V_{vc}|$ with $1 < X < N$, is according to Garey and Johnson [1] one of the six basic
NP-complete problems. So it is widely believed that one cannot construct any algorithm
which solves the problem substantially faster than exhaustive search, i.e. only algorithms
with an exponential worst-case time complexity in $N$ and $M$ are known.

VC is related to other well-known and widely used NP-complete problems. The first one
is the independent set (ISET) problem. An ISET is a subset $V_{iset} \subset V$ of vertices such
that for all $i, j \in V_{iset}$ we have $(i,j) \notin E$. So $V \setminus V_{iset}$ is obviously a VC for every ISET
$V_{iset}$, and every maximal ISET is the complement of a minimal VC. The independence
number, defined as the maximum of the cardinalities $|V_{iset}|$ of all ISETs, is consequently given
by $N - \min_{V_{vc}} |V_{vc}|$.
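
The complement relation between vertex covers and independent sets can be verified by brute force on a small graph. The sketch below is only an illustration; the helper names and the example graph (a cycle of five vertices) are assumptions made here and do not appear in the original text.

```python
from itertools import combinations


def is_vertex_cover(cover, edges):
    """True if every edge has at least one endpoint in `cover`."""
    return all(i in cover or j in cover for i, j in edges)


def is_independent_set(subset, edges):
    """True if no edge has both endpoints in `subset`."""
    return not any(i in subset and j in subset for i, j in edges)


def min_vertex_cover_size(n, edges):
    """Smallest cardinality of a vertex cover, by exhaustive search."""
    for size in range(n + 1):
        for cand in combinations(range(n), size):
            if is_vertex_cover(set(cand), edges):
                return size


def independence_number(n, edges):
    """Largest cardinality of an independent set, by exhaustive search."""
    for size in range(n, -1, -1):
        for cand in combinations(range(n), size):
            if is_independent_set(set(cand), edges):
                return size


if __name__ == "__main__":
    n = 5
    cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    # independence number = N - (minimum VC size)
    print(independence_number(n, cycle), n - min_vertex_cover_size(n, cycle))
```

The exhaustive search is exponential in $N$ and is meant only for graphs with a handful of vertices.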

A clique is a fully connected subgraph. So, if the subset $V_{iset} \subset V$ is an ISET in
$G = (V,E)$, it is a clique in the complementary graph $\bar{G} = (V, V \times V \setminus E)$. Finding the
largest clique in one graph is equivalent to finding the largest ISET in the complementary
graph.

2.2 Random graphs 

In order to speak of median or average cases, and of phase transitions, we have to introduce
a probability distribution over graphs. This is best done by using the concept of random
graphs, as introduced about 40 years ago by Erdős and Rényi [?]. A random graph
$G_{N,M}$ is a graph with $N$ vertices $V = \{1,\dots,N\}$ and $M$ randomly drawn edges such that any
two instances (for fixed $N$, $M$) are equiprobable.

An alternative description is to include every pair of vertices as an edge independently with a certain
probability $p$. For large $N$, the number of edges becomes almost surely $pN^2/2 + O(N)$, and
both concepts can be identified by choosing $p = 2M/N^2$.
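
Both ensembles are easy to sample. The following sketch is an illustration only: the function names are hypothetical, vertices are labeled from 0 instead of 1, and the quadratic-time enumeration of all pairs is chosen for clarity rather than efficiency.

```python
import random


def random_graph_nm(n, m, rng=random):
    """G(N, M): m distinct edges drawn uniformly among all N(N-1)/2 pairs."""
    all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return rng.sample(all_pairs, m)


def random_graph_np(n, p, rng=random):
    """Alternative ensemble: every pair becomes an edge independently with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


if __name__ == "__main__":
    n, c = 200, 1.0
    m = int(c * n)
    g_nm = random_graph_nm(n, m, random.Random(1))
    g_np = random_graph_np(n, 2.0 * m / n**2, random.Random(2))
    # Both graphs have an average vertex degree close to 2c.
    print(2 * len(g_nm) / n, 2 * len(g_np) / n)
```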

The regime we are interested in is that of finite-connectivity graphs, where the average vertex
degree $2c = 2M/N$ stays constant in the large-$N$ limit. Under this scaling of the edge
number, the cardinality of the minimal VC should typically depend linearly on $N$ as well,
$\min_{V_{vc}} |V_{vc}| = x_c(c)N$. The main purpose of this paper is to show evidence that there is an
asymptotically ($N \to \infty$) sharp threshold $x_c(c)$ which depends for almost all graphs only on
the average vertex degree $2c$, and to find its functional dependence on $c$.

Here we briefly review some of the fundamental results on random graphs which
were already described in [?], and which are important for the following sections:

The first point we want to mention is the distribution of vertex degrees $d$. In the limit
$N \to \infty$ it is given by a Poisson distribution with mean $2c$:

\[ \mathrm{Po}_{2c}(d) = e^{-2c} \frac{(2c)^d}{d!} . \tag{3} \]

A second point which is important for the understanding of the following is the component
structure. For $c < 1/2$, i.e. if the vertices have on average less than one neighbor, the graph
$G_{N,cN}$ is built up from connected components which have up to $O(\log N)$ vertices. The
probability that a component is a specific tree $T_k$ of $k$ vertices is given by

\[ p(k) = e^{-2ck} \frac{(2c)^{k-1}}{k!} , \tag{4} \]

and is equal for all $k^{k-2}$ distinct trees. As the fraction of vertices which are collected in finite
trees is $\sum_{k=1}^{\infty} p(k)\, k^{k-2}\, k = 1$ for all $c < 1/2$, in this case almost all vertices are collected in
such trees. For $c > 1/2$ a giant component appears which contains a finite fraction of all
vertices; $c = 1/2$ is therefore called the percolation threshold.
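
The normalization of the tree distribution can be checked numerically. The sketch below assumes the form $p(k) = e^{-2ck} (2c)^{k-1}/k!$ for the tree weight and evaluates the terms in log space to avoid overflow; the function names are illustrative.

```python
import math


def poisson_degree(d, c):
    """Poisson degree distribution with mean 2c."""
    return math.exp(-2.0 * c) * (2.0 * c) ** d / math.factorial(d)


def fraction_in_trees(c, k_max):
    """Partial sum over k <= k_max of p(k) * k^(k-2) * k, with the assumed
    tree weight p(k) = exp(-2ck) (2c)^(k-1) / k!; requires c > 0."""
    total = 0.0
    for k in range(1, k_max + 1):
        log_term = (-2.0 * c * k + (k - 1) * math.log(2.0 * c)
                    + (k - 1) * math.log(k) - math.lgamma(k + 1))
        total += math.exp(log_term)
    return total


if __name__ == "__main__":
    print(sum(poisson_degree(d, 1.0) for d in range(40)))  # close to 1
    print(fraction_in_trees(0.1, 200))   # below the percolation threshold: close to 1
    print(fraction_in_trees(0.1, 7))     # share of vertices in trees of at most 7 vertices
    print(fraction_in_trees(0.6, 2000))  # above the threshold: clearly below 1
```

Above $c = 1/2$ the missing weight corresponds to the vertices in the giant component.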
2.3 Rigorously known bounds



In this subsection we present some previously known rigorous bounds on $x_c(c)$.
A general one for arbitrary, i.e. non-random, graphs was given by Harant [?], who generalized
an old result of Caro and Wei [?]. Translated into our notation, he showed that

\[ x_c(G) \le 1 - \frac{1}{N} \frac{\left( \sum_{i \in V} \frac{1}{d_i+1} \right)^2}{\sum_{i \in V} \frac{1}{d_i+1} - \sum_{(i,j) \in E} \frac{(d_i-d_j)^2}{(d_i+1)^2 (d_j+1)^2}} , \tag{5} \]

where $d_i$ is the vertex degree of vertex $i$. Using the distribution (3) of vertex degrees and
its generalization to pairs of connected vertices, this can easily be converted into an upper
bound on $x_c(c)$ which holds almost surely for $N \to \infty$.

The vertex cover problem and the above-mentioned related problems were also studied in
the case of random graphs, and even completely solved in the case of infinite-connectivity
graphs, where any edge is drawn with finite probability $p$, such that the expected number
of edges is $O(N^2)$. There the minimal VC has cardinality $N - 2\log_{1/(1-p)} N -
O(\log\log N)$ [?]. Bounds in the finite-connectivity region of random graphs with $N$ vertices
and $cN$ edges were given by Gazmuri [?]. He showed that

\[ x_l(c) < x_c(c) < 1 - \frac{\log 2c}{2c} , \tag{6} \]

where the lower bound is given by the unique solution of

\[ 0 = x_l(c) \log x_l(c) + (1 - x_l(c)) \log(1 - x_l(c)) + c\,(1 - x_l(c))^2 . \tag{7} \]

As we will see later on, this bound coincides with the so-called annealed bound in statistical
physics. The correct asymptotics for large $c$ was given by Frieze,

\[ x_c(c) = 1 - \frac{1}{c} \left( \log c - \log\log 2c + 1 \right) + o\!\left(\frac{1}{c}\right) . \tag{8} \]



3 Numerical evidence for a phase transition 

To achieve a thorough insight into the nature of the problem, numerical simulations were 
performed. At first the branch-and-bound algorithm is explained which was implemented 
for this purpose. Then, results are presented which relate the transition in solvability to a 
change in the median-case time complexity. Also the dependence of the backbone (see below) 
on the cover size x shows a jump at this transition. 



3.1 The algorithm 

All numerical results were obtained by exact enumeration. Using a branch-and-bound
algorithm similar to [12], all covers can be calculated: as each vertex is either covered
or uncovered, there are $2^N$ possible configurations which can be arranged as leaves of a binary
(backtracking) tree. At each node, the two subtrees represent the subproblems where the
corresponding vertex is either covered or uncovered. The branch operation tries to find a
solution by investigating both subtrees and keeping only the optimum solutions.

First we concentrate on the algorithm which finds the configurations with the minimum
number of uncovered edges for a given graph and a given number $X$ of vertices which can
be covered. We want to omit subtrees which for sure contain no optimum solutions: this is
the case either if the number of covered vertices exceeds $X$ or if the leaves of the subtree can
already be proven to be worse than previously considered configurations. Thus, it is possible
to avoid branching into some subtrees by calculating the following bound: it uses the current
vertex degree $d(i)$, which is the number of uncovered neighbors at a specific stage of the
calculation. By covering a vertex $i$ the total number of uncovered edges is reduced by exactly
$d(i)$. If several vertices $j_1, j_2, \dots, j_k$ are covered, the number of uncovered edges is
reduced by at most $d(j_1) + d(j_2) + \dots + d(j_k)$. Assume that at a certain stage within the backtracking
tree, there are uncov edges uncovered and still $k$ vertices to cover. Then a lower bound $M$
for the best solution which can be found in the subtree is

\[ M = \max\left[ 0,\; \text{uncov} - \max_{j_1,\dots,j_k} \big( d(j_1) + \dots + d(j_k) \big) \right] . \tag{9} \]

The maximum is easily calculated by always storing the uncovered vertices sorted according
to their current degrees. The algorithm can avoid branching into a subtree if $M$ is strictly
larger than the number opt of uncovered edges in the best solution found so far. If one is
interested only in an arbitrary minimum configuration instead of enumerating all, one can
omit every subtree with $M \ge$ opt. In the latter case the algorithm can be stopped as soon
as a configuration with opt $= 0$ is found.

For the order the vertices are selected to be (un-)covered within the algorithm, the fol- 
lowing heuristic is applied: the order of the vertices is given by their current degree. Thus, 
the first descent into the tree is equivalent to the greedy heuristic which iteratively covers 
vertices by always taking the vertex with the highest current degree. Later, it will become
clear from the results that this heuristic is indeed a suitable strategy.

The following representation summarizes the algorithm for enumerating all configurations 
exhibiting a minimum number of uncovered edges. Let G = (V, E) be a graph, k the number 
of vertices to cover and uncov the number of edges to cover. Initially $k = X$ and uncov $= |E|$.
The variable opt is initialized with opt $= |E|$ and contains the minimum number of uncovered
edges found so far. The value of opt is passed via call by reference. At the beginning all 
vertices i are marked as free. The marks are considered to be passed via call by reference 
as well (not shown explicitly) . Additionally it is assumed that somewhere a set of (optimum) 
solutions can be stored. 

algorithm min-cover(G, k, uncov, opt)
begin
    if k = 0 then {leaf of tree reached?}
    begin
        if uncov < opt then {new minimum found?}
        begin
            opt := uncov;
            clear set of stored configurations;
        end;
        store configuration;
        return; {no vertices left to cover: do not branch further}
    end;
    if bound condition is true (see text) then
        return;
    let i ∈ V be a vertex marked as free of maximal current degree;
    mark i as covered;
    k := k - 1;
    adjust degrees of all neighbors j of i: d(j) := d(j) - 1;
    min-cover(G, k, uncov - d(i), opt); {branch into 'left' subtree}
    mark i as uncovered;
    k := k + 1;
    (re)adjust degrees of all neighbors j of i: d(j) := d(j) + 1;
    min-cover(G, k, uncov, opt); {branch into 'right' subtree}
    mark i as free;
end
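
The procedure can be sketched in runnable form as follows. This is a simplified illustration and not the LEDA-based implementation used for the results: it returns only the minimum number of uncovered edges (no configurations are stored), it prunes with $M \ge$ opt as appropriate when a single optimum suffices, and it re-sorts the free vertices in every call instead of maintaining a sorted structure. All names are hypothetical, and the graph is assumed to be simple (no multiple edges or self-loops).

```python
FREE, COVERED, UNCOVERED = 0, 1, 2


def min_uncovered_edges(n, edges, budget):
    """Minimum number of uncovered edges if at most `budget` vertices are covered."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    deg = [len(a) for a in adj]  # current degree: neighbors that are not covered
    state = [FREE] * n
    best = [len(edges)]          # plays the role of `opt`

    def branch(k, uncov):
        if k == 0 or uncov == 0:  # leaf reached, or all edges already covered
            best[0] = min(best[0], uncov)
            return
        free = sorted((v for v in range(n) if state[v] == FREE),
                      key=lambda v: deg[v], reverse=True)
        if not free:              # nothing left to decide in this subtree
            best[0] = min(best[0], uncov)
            return
        # Bound (9): covering k more vertices removes at most the sum of the
        # k largest current degrees from the number of uncovered edges.
        bound = max(0, uncov - sum(deg[v] for v in free[:k]))
        if bound >= best[0]:
            return
        i = free[0]               # greedy choice: highest current degree
        state[i] = COVERED
        for j in adj[i]:
            deg[j] -= 1
        branch(k - 1, uncov - deg[i])  # 'left' subtree: i is covered
        for j in adj[i]:
            deg[j] += 1
        state[i] = UNCOVERED
        branch(k, uncov)               # 'right' subtree: i stays uncovered
        state[i] = FREE

    branch(budget, len(edges))
    return best[0]


def min_vertex_cover_size(n, edges):
    """Smallest budget for which no edge remains uncovered."""
    for budget in range(n + 1):
        if min_uncovered_edges(n, edges, budget) == 0:
            return budget


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    print([min_uncovered_edges(5, cycle, x) for x in range(6)])
    print(min_vertex_cover_size(5, cycle))
```

Because covering an additional vertex never increases the number of uncovered edges, allowing at most `budget` covered vertices gives the same minimum as requiring exactly that many.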

In the actual implementation, the algorithm also does not descend further into the tree
when no uncovered edges are left. In this case the vertex covers of the corresponding
subtree consist of the vertices covered so far and all possible selections of $k$ vertices among
all uncovered vertices.

Now we discuss the case of finding a true VC of minimum cardinality, where the performance
of the method can be enhanced by some extensions. The algorithm is called with
$k = |V|$, opt $= 0$, and $k$ is passed via call by reference like opt. Now assume that during
the execution of the algorithm a total cover (uncov $= 0$) is found and $k > 0$. Thus it is
possible to cover all edges with fewer than the allowed number of vertices. Consequently, it is
not necessary to cover additional vertices, and the value of $k$ is set to zero. Additionally the
set of configurations which was stored before is cleared. Furthermore, whenever a vertex $i$ is
marked as uncovered, all its neighbors $j$ can be covered immediately, because no uncovered
edge should remain. Please note that in this case the degrees of all neighbors of the neighbors
$j$ of $i$ have to be readjusted as well. After the initial call of this modified algorithm has
finished, the variable $k$ contains the cardinality of the minimum vertex cover.

The algorithm was implemented with the help of the LEDA library [13], which offers many
useful data types and algorithms for linear algebra and graph problems. Since the VC problem
is NP-hard, the method exhibits an exponential worst-case time complexity. Although our
algorithm is very simple, in the regime $0.5 < c < 5$ random graphs up to size $N = 100$ could
be treated for all values $X \in [0,N]$. For the calculation of covers of minimum cardinality,
also graphs with $N = 140$ could be considered. Please note that for $c < 0.5$ the graphs can
be divided into many connected components of sizes up to $O(\log N)$. Then, in the case one
is interested only in the cover of minimum cardinality, the algorithm can be applied to each
component separately, yielding only a polynomial time complexity.

3.2 Numerical results 

First evidence for a peak of the typical-case complexity near the threshold was given in [?],
where the problem was mapped to SAT and solved with the Davis-Putnam procedure. The
running time was measured for graphs of size $N = 12$. Here, systems up to size $N = 140$
are investigated. Since data for several different graph sizes are available, it is possible to
extrapolate the behavior of the infinite graph using finite-size scaling techniques. The results
of these extrapolations will be presented in a subsequent section, along with the outcomes of
analytical calculations.

In Fig. 1 the probability $P_{cov}(x)$ of finding a vertex cover of cardinality $xN$ for a random
graph $G_{N,cN}$ is displayed for $c = 1$ and different values of $N$ (10000 instances per value of $x$,
1000 for $N = 100$). The drop of the probability from one for large cover sets to zero for small
cover sets obviously sharpens with $N$. Thus, a jump at a well-defined $x_c(c)$ is to be expected
in the large-$N$ limit: Above $x_c(c)$ almost all random graphs with $cN$ edges are coverable
with $xN$ vertices, below $x_c(c)$ almost no graph has such a VC. The curves in the left part of
the figure show the average minimal fraction $e(x)$ of uncovered edges, which for a coverable
graph is obviously zero. In the large-$N$ limit, the disappearance of positive $e(x)$ coincides
with the threshold.

It is very instructive to measure the median computational effort, as given by the number
of visited nodes in the backtracking tree, as a function of $x$ and $N$. The curves which are
shown in Fig. [?] exhibit a pronounced peak at the threshold value. Inside the coverable
phase, $x > x_c(c)$, the computational cost grows only linearly with $N$, and in many cases
the heuristic is already able to find a cover with $xN$ vertices. Below the threshold, $x < x_c(c)$,
it is clearly exponential in $N$ (see inset). This easy-hard transition closely resembles the
typical-case complexity pattern of 3SAT and deserves some more detailed investigation,
which will be provided by the analytical calculation later on.

In Fig. [?] the median time is plotted separately for the subsets of coverable and uncoverable
graphs, respectively. In addition, a scatter plot is included, which contains a dot for each
result for 100 graphs and for different cardinalities $xN$ of the cover. For a given graph $G_{N,cN}$,
as long as it is not coverable with $xN$ vertices, the computer time grows strongly with $x$. But
as soon as a graph is coverable, it takes only a small computational effort to find a cover.
The reason that the median effort over all graphs is reduced for $x > x_c$ is that the fraction of
uncoverable graphs decreases rapidly.

Another quantity is directly related to the transition: The outcome of the algorithm is
a configuration, i.e. a vector of marks telling whether a given vertex is covered or not. For
a given graph and a given fraction $x$ usually different configurations are feasible, all exhibiting
the same minimal number $e(x)cN$ of uncovered edges. An enumeration shows that the
number of these configurations grows exponentially with the system size for all values of N.
Nevertheless, for $x < x_c(c)$ there is always a finite fraction of vertices which behave equally
in all different configurations: they are either always covered or always uncovered. The set
of these vertices is the backbone $B$.

For $x > x_c$ and in the large-$N$ limit, there is no non-empty backbone: the graph is
already coverable with $x_c(G_{N,cN})N$ vertices, the other $(x - x_c)N$ can be distributed freely.
This already excludes the existence of vertices being always uncovered. The maximal vertex
degree in a random graph $G_{N,cN}$ grows only as $O(\log N)$. So the neighbors of every covered
vertex can be covered with some of the remaining $(x - x_c)N$ free cover marks, and the central
vertex itself can be uncovered and thus does not belong to the backbone.

Later we will see that directly at the threshold a finite backbone size $b(x) = |B|/N$
appears. Thus, for $N \to \infty$ the function $b(x)$ exhibits a discontinuity at $x_c(c)$. This is
indicated by the results obtained from the numerical calculations, again for the case $c = 1$,
see Fig. [?]. For $x < x_c(1)$ the relative backbone size $b(x)$ is large and almost independent
of $N$. For $x > x_c(1)$ a sharp decrease can be observed, which becomes more pronounced with
increasing $N$.
A surprising result is obtained when we study coverable and uncoverable graphs separately.
This can be done only in the vicinity of the transition, $x \approx x_c(1)$, where coexisting coverable
and uncoverable graphs can be found for finite $N$. The inset of Fig. [?] shows the result:
Above the threshold, the coverable graphs exhibit a smaller backbone, as expected from the
discussion above. But the curves intersect near $x_c(1)$. This behavior is observed for all
graph sizes $N$, and the effect becomes more pronounced with increasing system size. As an
explanation, we take a look at graphs being coverable with a small number of vertices. Their
distribution of vertex degrees must deviate substantially from (3), showing more vertices
with high degree. These vertices are expected to be in the backbone with high probability,
see also the discussion on the correlation between vertex degree and backbone at the end of
section 5.3.1. Consequently, the backbone is expected to be very large. The crossing of both
curves close to $x_c$ seems to be accidental. By measuring the intersection as a function of $N$
and extrapolating to $N \to \infty$, the limiting value is found to be significantly below $x_c$.

We have seen that the vertex-cover problem exhibits several peculiar features. These are
worth addressing by analytical methods which allow one to reveal the structure of VCs.

4 Cluster expansion for low vertex degrees



One of the classical results on random graphs is, as mentioned in section 2.2, that for low
edge densities $c < 1/2$ almost all vertices are collected in finite trees, as

\[ \sum_{k=1}^{\infty} p(k)\, k^{k-2}\, k = 1 , \tag{10} \]

with $p(k)$ being the distribution of trees $T_k$ with $k$ vertices, cf. section 2.2. So the threshold
$x_c$ and the corresponding backbone $b$ are given by

\[ x_c(c) = \sum_{k=1}^{\infty} p(k) \sum_{T_k} X_c(T_k) , \qquad
   b(c, x_c(c)) = \sum_{k=1}^{\infty} p(k) \sum_{T_k} B_c(T_k) , \tag{11} \]

where $\sum_{T_k}$ denotes the sum over all different trees $T_k$. $X_c(T_k)$ (resp. $B_c(T_k)$) is the cardinality
of the minimal VCs (resp. of their backbone) of $T_k$.

For very small average vertex degrees, $c \ll 0.5$, most vertices are furthermore concentrated
in small components, and we can produce good approximations for the threshold, the
backbone etc. by counting small trees. There, the distinction between backbone and
non-backbone vertices also becomes evident: Consider e.g. a connected component consisting
only of two vertices and one edge. To cover this minimally, we need exactly one vertex, but
it is not specified which one. The vertices do not belong to the backbone at threshold, and
they give a contribution to a finite entropy (i.e. an exponential number) of minimal VCs.
The situation is different for a tree of three vertices and two edges. The minimal cover is
unique: Only the central vertex has to be taken. Consequently all three vertices belong
to the backbone at the threshold. Already at this point, the partial freezing of degrees of
freedom as observed in SAT [17, 19] becomes evident.

We have counted the optimal covers for trees up to 7 vertices, see the results in Table
1. The values for the threshold and the backbone are lower bounds, as a certain fraction of
vertices is not included. Upper bounds are provided by adding the fraction of missing vertices
to the lower bounds. For small $c$ these bounds are very precise, e.g. for $c = 0.1$, 99.98% of
all vertices are already included in the small trees up to size 7. These approximate values
will be a useful testing ground for the statistical mechanics calculations which are given in
section 5.
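
The counting of small trees can be sketched as follows. The code enumerates all labeled trees with $k$ vertices through their Prüfer sequences, determines the minimal covers of each tree by brute force, and accumulates the lower bounds on threshold and backbone. It assumes the tree weight $p(k) = e^{-2ck}(2c)^{k-1}/k!$; the function names and the brute-force strategy are illustrative choices, not the procedure used for Table 1.

```python
import math
from itertools import product


def prufer_to_edges(seq, k):
    """Decode a Pruefer sequence into the edge list of a labeled tree (k >= 2)."""
    degree = [1] * k
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        leaf = min(u for u in range(k) if degree[u] == 1)
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    u, w = [x for x in range(k) if degree[x] == 1]
    edges.append((u, w))
    return edges


def cover_and_backbone(k, edges):
    """Size of the minimal VCs of a graph on k vertices and size of their backbone."""
    best, covers = k + 1, []
    for s in range(1 << k):  # bit i of s set <=> vertex i is covered
        if all((s >> i) & 1 or (s >> j) & 1 for i, j in edges):
            size = bin(s).count("1")
            if size < best:
                best, covers = size, [s]
            elif size == best:
                covers.append(s)
    in_all, in_any = covers[0], 0
    for s in covers:
        in_all &= s
        in_any |= s
    in_none = ~in_any & ((1 << k) - 1)
    return best, bin(in_all).count("1") + bin(in_none).count("1")


def cluster_expansion(c, k_max):
    """Return (v, x_min, b_min) from all trees with at most k_max vertices."""
    v = x_min = b_min = 0.0
    for k in range(1, k_max + 1):
        p_k = math.exp(-2.0 * c * k) * (2.0 * c) ** (k - 1) / math.factorial(k)
        if k == 1:
            trees = [[]]  # isolated vertex, no edges
        else:
            trees = (prufer_to_edges(seq, k) for seq in product(range(k), repeat=k - 2))
        for edges in trees:
            size, backbone = cover_and_backbone(k, edges)
            v += p_k * k
            x_min += p_k * size
            b_min += p_k * backbone
    return v, x_min, b_min


if __name__ == "__main__":
    v, x_min, b_min = cluster_expansion(0.05, 6)
    # lower and upper bounds: add the fraction 1 - v of vertices not included
    print(v, (x_min, x_min + 1 - v), (b_min, b_min + 1 - v))
```

Choosing `k_max = 7` corresponds to the truncation described above; the run time grows quickly with `k_max`, since there are $k^{k-2}$ labeled trees of size $k$.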

This tree-size expansion is no longer possible above the percolation threshold $c = 1/2$.
There the giant component arises which includes a finite fraction of all vertices. 



5 Statistical mechanics approach 

In this section we use the strong similarities between combinatorial optimization and sta- 
tistical mechanics. The cost function of a system which shall be optimized corresponds to 
the energy function (or Hamiltonian) in statistical mechanics. The elements of the definition 
space of the cost function are called microscopic configurations. The main aim of statistical 
mechanics is the description of the macroscopic behavior of a microscopically defined model, 
e.g. the prediction and description of phase transitions.

c          0.05      0.1     0.15   0.2    0.25   0.3    0.4    0.5
v          0.999997  0.9998  0.998  0.991  0.97   0.94   0.84   0.71
x_min      ?         ?       ?      ?      ?      0.17   0.17   0.15
x_max      0.045579  0.0842  0.118  0.151  0.19   0.23   0.33   0.44
b_min      0.916684  0.8572  0.812  0.774  0.74   0.70   0.61   0.51
b_max      0.916687  0.8574  0.814  0.781  0.77   0.76   0.77   0.80
s_min      0.028774  0.0488  0.063  0.073  0.078  0.08   0.08   0.07
s_max      0.028775  0.0489  0.064  0.076  0.088  0.10   0.13   0.17
x_c(c)     0.045577  0.0841  0.117  0.146  0.173  0.196  0.237  0.272
b_c(c)     0.916686  0.8573  0.813  0.779  0.753  0.731  0.700  0.678

Table 1: Results of the cluster expansion for trees having up to 7 vertices and several values of
$c$. $v$ denotes the fraction of vertices which are included in the considered trees, $N x_{min/max}$
give lower and upper bounds on the number of vertices which are needed to cover these
components, $b_{min/max}$ are backbone bounds, $s_{min/max}$ bounds for the VC entropy. These
values are to be compared with the analytical results of the replica approach which are
presented in the last two lines.

5.1 General strategy 

In order to describe the VC phase transition also beyond the percolation threshold, we are
going to use the tools of the statistical mechanics of disordered systems [15]. We therefore
map the random graph to a disordered spin system with a Hamiltonian which shall be
minimized. A canonical choice for the "energy" of a subset $V' \subset V$ of vertices is given by the
number of uncovered edges:

\[ H(\{S_i\}, \{J_{ij}\}) = \sum_{i<j} J_{ij}\, \delta_{S_i,-1}\, \delta_{S_j,-1} , \tag{12} \]

where $J_{ij}$ are the entries of the symmetric adjacency matrix: they are equal to one whenever
there is an edge connecting the vertices $i$ and $j$, and zero otherwise. The diagonal elements are
identically set to zero. The covering state of the vertices is mapped to a configuration of $N$
Ising spins $S_i = \pm 1$: we choose $S_i = +1$ if $i \in V'$, i.e. if the vertex $i$ is covered, and $S_i = -1$
if $i$ is uncovered. Non-zero contributions to the Hamiltonian result only from edges having
two uncovered endpoints.

The decision problem whether there exists any VC with xN vertices can be answered by 
minimizing H under the constraint 

\[ \frac{1}{N} \sum_{i=1}^{N} S_i = 2x - 1 , \tag{13} \]

which fixes the cardinality of the cover set, or in physical terms, the global magnetization 
of our Ising-spin system. If this restricted minimal energy equals zero, then there are no 
uncovered edges left, and the decision problem can be positively answered. If, on the other 
hand, a positive minimal energy is found, there does not exist any VC of cardinality xN, but 
the ground state energy gives the best compromise by describing the configuration with the 
minimal number of uncovered edges. 
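
For a small graph, the constrained minimization can be carried out by exhaustive enumeration. The sketch below is only an illustration of the spin mapping: the function names are hypothetical, and the exponential-time search merely makes explicit that a vanishing constrained ground-state energy is equivalent to the existence of a VC with $xN$ vertices.

```python
from itertools import combinations


def energy(spins, edges):
    """H = number of edges whose two endpoints both carry S = -1 (uncovered)."""
    return sum(1 for i, j in edges if spins[i] == -1 and spins[j] == -1)


def ground_state_energy(n, edges, n_covered):
    """Minimize H over all configurations with exactly n_covered spins S = +1,
    i.e. with magnetization sum(S) / n = 2 * n_covered / n - 1."""
    best = len(edges)
    for covered in combinations(range(n), n_covered):
        spins = [-1] * n
        for i in covered:
            spins[i] = +1
        best = min(best, energy(spins, edges))
    return best


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    for n_covered in range(6):
        e_min = ground_state_energy(5, cycle, n_covered)
        print(n_covered, e_min, "VC exists" if e_min == 0 else "no VC")
```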

In statistical mechanics every microscopic configuration $\{S_i\}_{i=1,\dots,N}$ is assigned a probability
proportional to the Gibbs weight $\exp\{-T^{-1} H(\{S_i\})\}$ at temperature $T$. By decreasing
$T$, this weight becomes more and more concentrated in low-energy configurations and finally,
at $T = 0$, counts only the ground states, i.e. the configurations minimizing the Hamiltonian.
In order to characterize these in the VC problem, we first introduce a non-zero formal
temperature $T$ and calculate the partition function

\[ Z(T, x|\{J_{ij}\}) = \sum_{C_x(\{S_i\})} \exp\left\{ -\frac{1}{T} H(\{S_i\}, \{J_{ij}\}) \right\} , \tag{14} \]

where we sum only over the set $C_x(\{S_i\})$ of configurations $\{S_i\}_{i=1,\dots,N}$ which satisfy the
magnetization constraint (13). From this we may calculate the free-energy density

\[ f(T, x|\{J_{ij}\}) = -\frac{T}{N} \log Z(T, x|\{J_{ij}\}) , \tag{15} \]

which in its zero-temperature limit gives the desired ground-state energy density:

\[ e_{GS}(x|\{J_{ij}\}) = \lim_{T \to 0} f(T, x|\{J_{ij}\}) . \tag{16} \]



This energy does still depend on the particular realization of the graph encoded in the matrix
$\{J_{ij}\}$. In the limit $N \to \infty$ (with $c = M/N = \text{const.}$) we expect however the free energy to
be self-averaging, and so we are only interested in calculating

\[ e_{GS}(x, c) = \lim_{T \to 0} f(T, x, c) = \lim_{T \to 0} \lim_{N \to \infty} \overline{f(T, x|\{J_{ij}\})} , \tag{17} \]

where the overbar stands for the average over the ensemble of random graphs with $N$ vertices
and $cN$ edges. Another interesting quantity is the ground-state entropy

\[ s_{GS}(x, c) = \lim_{N \to \infty} \frac{1}{N} \overline{\log \mathcal{N}_{GS}(x, \{J_{ij}\})} , \tag{18} \]

where $\mathcal{N}_{GS}(x, \{J_{ij}\})$ is the number of ground states with cardinality $xN$ in the graph given
by $\{J_{ij}\}$. It is also useful to consider the VC entropy

\[ s_{VC}(x, c) = \begin{cases} s_{GS}(x, c) & \text{if } e_{GS}(x, c) = 0 \\ -\infty & \text{else} \end{cases} \tag{19} \]

which measures the number of VCs.

5.2 The annealed approximation 

Before trying to calculate this, we will present the so-called annealed approximation. We use 
the bound 

log Z(T,x\{J^}) < logZ^xKJ^ }) (20) 

for the average of the logarithm of the partition function in terms of the logarithm of the 
average of the partition function. It holds because the logarithm is a concave function. We 
easily calculate the annealed entropy, see Appendix [A| for details, 

1 



s ann (x,c) = lim lim — log Z(T, x\{ Jij}) 

T — >0 N^-oc iv 

= -xlogx- (1 - x)log(l - x) - c(l - x) 2 (21) 
and can bound the VC entropy 

svc{x,c) < s ann (x,c) . (22) 
VCs can thus only exist if the annealed entropy is non-negative, and $x_c(c)$ is bounded from
below by $x_{ann}(c)$, which is given by $s_{ann}(x_{ann}(c), c) = 0$, i.e. by the inversion of

$$c = \frac{-x_{ann}(c) \log x_{ann}(c) - (1 - x_{ann}(c)) \log(1 - x_{ann}(c))}{(1 - x_{ann}(c))^2} . \qquad (23)$$

This is exactly the lower bound given by Gazmuri, which is not surprising since Gazmuri used very
similar reasoning.
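The inversion of the annealed condition has no closed form, but it is easy to do numerically. The sketch below uses plain bisection on the annealed entropy (21); the function names and the tolerance are illustrative choices, not part of the original calculation.

```python
from math import log


def s_ann(x, c):
    """Annealed entropy, eq. (21), for 0 < x < 1."""
    return -x * log(x) - (1.0 - x) * log(1.0 - x) - c * (1.0 - x) ** 2


def x_ann(c, tol=1e-12):
    """Root of s_ann(x, c) = 0 in (0, 1), located by bisection.

    For c > 0 the entropy is negative close to x = 0 and positive
    close to x = 1, so the interval brackets the root.
    """
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if s_ann(mid, c) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for c in (0.25, 0.5, 1.0, 2.0):
    print(c, x_ann(c))
```

For c = 1.0 the root lies close to x = 0.25, i.e. no vertex cover containing fewer than about a quarter of the vertices can exist according to this bound.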

5.3 The replica approach 

If we want to go beyond the annealed approximation, we have to average the logarithm of the
partition function over the disorder. Unfortunately, this cannot be achieved directly; the way
out is given by the so-called replica trick, a non-rigorous method which is well established
in the physics of disordered systems [15]. Details of the calculation are given in Appendix B,
where we derive the so-called replica-symmetric approximation of the
free-energy density

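$$\begin{aligned}
f(T,x,c) ={}& T \int_{-\infty}^{\infty} \frac{dh\,dk}{2\pi}\, e^{-ihk}\, P_{FT}(k)\,[\log P_{FT}(k) - 1]\, \log 2\cosh T^{-1}h \\
& - cT \int_{-\infty}^{\infty} dh_1\,dh_2\, P(h_1)P(h_2)\, \log\left[ 1 + (e^{-T^{-1}} - 1)\, \frac{e^{-T^{-1}(h_1+h_2)}}{4\cosh T^{-1}h_1\, \cosh T^{-1}h_2} \right] \qquad (24)
\end{aligned}$$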

This quantity has to be optimized with respect to the order parameter P(h), which is again
restricted by the magnetization constraint to

$$2x - 1 = \int_{-\infty}^{\infty} dh\, P(h)\, \tanh T^{-1}h . \qquad (25)$$

$P_{FT}(k)$ denotes the Fourier transform of P(h).

The physical interpretation of the order parameter in terms of the effective field distribution
is straightforward: P(h)dh gives the probability that a randomly chosen site $i \in V$ has
local magnetization $m_i = \langle S_i \rangle_T = \tanh T^{-1}h$. This distribution (or the distribution of local
magnetizations) is the typical order parameter in disordered finite connectivity models, cf.
fill , [T^| . It is determined by the optimization equation for the free energy (24), which reads



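$$\int dh\, P(h)\, e^{T^{-1}hs} = \exp\left\{ -2c + \lambda s + 2c \int dh\, P(h) \left[ 1 + \frac{e^{-T^{-1}} - 1}{1 + e^{2T^{-1}h}} \right]^{-s/2} \right\} \qquad (26)$$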

The Lagrange parameter $\lambda$ in the exponential has to be adjusted in order to meet the
magnetization constraint (25).

This equation as well as the expression (24) for the free energy still depend on the formal
temperature T, and the limit $T \to 0$ is not totally obvious: we have to clarify the scaling of
the effective fields h with T. There are two main possibilities:

• The fields h are proportional to the formal temperature, $h = O(T)$ for $T \to 0$. As can
be simply seen in the expression (24) for the average free energy, we then also have
$f(T,x,c) = O(T)$, and the ground state energy $e_{GS}(x,c)$ vanishes. These fields are
consequently found in the coverable phase with $x > x_c$. Another important property
is that the corresponding local magnetizations $m = \tanh T^{-1}h$ do not tend to ±1, and
the corresponding spins take different orientations in different ground states.

• The fields h remain different from zero even if the temperature vanishes, $h = O(T^0)$. The
corresponding spins have ground state magnetization ±1, and consequently take on the
same value in (almost) all ground state configurations, i.e. they form the backbone. If
we introduce such fields in (24), we immediately find that f(T,x,c) does not vanish in
the zero-temperature limit: the ground state energy becomes positive, and such fields
cannot exist in the COV phase. Their appearance marks the transition.

5.3.1 At the threshold 

If we were able to solve (26) at finite temperature for arbitrary x and c, we could deduce
the scaling directly from the solution, and thus we could determine $x_c(c)$. As this is too
complicated to be achieved directly, we plug in the two different scalings and calculate
the limit $T \to 0$. We then find two different equations for P(h) in the two different phases.
The phase transition point is given by the matching of both equations:

• If we reach the threshold from above, $x \to x_c(c) + 0$, we are in the coverable phase.
According to the above discussion, the effective fields are $h = T H_{cov}(x)\,z$, where $H_{cov}(x)$
describes the typical absolute value of the field and z is a random variable of finite mean
and variance. For $x \to x_c(c) + 0$ the spins $S_i$ are more and more constrained, and at $x_c(c)$
a freezing takes place. The limit is therefore described by $H_{cov}(x \to x_c(c) + 0) \to \infty$.

• If we reach the threshold from below, $x \to x_c(c) - 0$, we are in the uncoverable phase,
and at least a finite fraction of all spins has to be frozen. The corresponding effective
fields scale as $h = H_{uncov}(x)\,z$, where now the scale for the absolute value of h is
described by $H_{uncov}(x)$. As we approach the threshold, the freezing gets less strong,
and $H_{uncov}(x \to x_c(c) - 0) \to 0$.

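In both limits we find the same equation for the probability distribution P(z) of the rescaled
variable z, see Appendix C for a derivation. Here $\mu$ is the appropriately rescaled Lagrange
parameter; it is negative, as it describes a field which decreases the global magnetization from the
maximum entropy point towards the threshold $x_c(c)$:

$$P(z+\mu) = e^{-2c} \sum_{d=0}^{\infty} \frac{(2c)^d}{d!}\, P_-^{*d}(z), \qquad
P_-(z) = \int_{-\infty}^{\infty} dz'\, P(z')\, \left[ \Theta(-z')\,\delta(z+z') + (1-\Theta(-z'))\,\delta(z) \right] \qquad (27)$$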

with the Heaviside step function $\Theta(z)$ and the Dirac distribution $\delta(z)$. $P_-^{*d}$ denotes the d-fold
convolution product of $P_-$. The interpretation of this equation is simple: the effective field for a
randomly chosen vertex i is given by the linear superposition of the local field induced by the
Lagrange multiplier and the contributions of its $d_i$ neighbors. If a neighbor has a negative
field, then it is uncovered, and thus forces a positive field on i. If it has a non-negative field, it
does not induce any non-vanishing field on i. As P(z) is the histogram of fields for all vertices,
equation (27) includes the average over the Poisson distribution (3) of vertex degrees.
This equation has a very simple solution,

$$P(z) = \sum_{m=-1}^{\infty} \frac{W(2c)^{m+2}}{2c\,(m+1)!}\; \delta(z + m\mu) \qquad (28)$$

with the Lambert-W function W which is simply defined by

$$y = W(x) \;\Longleftrightarrow\; x = y\,e^{y} . \qquad (29)$$
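The weights of this solution can be checked with a short calculation, sketched here under the interpretation given above: every vertex receives the field $\mu$ from the Lagrange multiplier and an additional $-\mu$ from each neighbor carrying a negative field. The symbol $a$ for the total weight of negative fields and $k$ for the number of such neighbors are introduced only for this check.

$$\begin{aligned}
a &= \sum_{d\ge 0} e^{-2c}\,\frac{(2c)^d}{d!}\,(1-a)^d = e^{-2ca}
\;\Longrightarrow\; (2ca)\,e^{2ca} = 2c
\;\Longrightarrow\; a = \frac{W(2c)}{2c}, \\
\mathrm{Prob}(k) &= e^{-2ca}\,\frac{(2ca)^k}{k!} = e^{-W(2c)}\,\frac{W(2c)^k}{k!} = \frac{W(2c)^{k+1}}{2c\;k!} .
\end{aligned}$$

The first line states that a vertex keeps the negative field $\mu$ only if none of its neighbors has a negative field; the second uses that the number of such neighbors is Poisson distributed with mean $2ca = W(2c)$. A vertex with $k$ such neighbors has field $(1-k)\mu$, so setting $k = m+1$ gives the weight $W(2c)^{m+2}/[2c\,(m+1)!]$ at $z = -m\mu$, and the weights sum to $W(2c)\,e^{W(2c)}/(2c) = 1$.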






Non-zero fields correspond to frozen (or backbone) spins, whereas the Dirac peak at z = 0
describes all spins which flip from one minimal VC to the next. The backbone size is consequently
given by the total weight of all non-zero fields. From this we can calculate the threshold and
the backbone,

$$x_c(c) = 1 - \frac{2W(2c) + W(2c)^2}{4c}, \qquad b_c(c) = 1 - \frac{W(2c)^2}{2c} . \qquad (30)$$
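The closed-form expressions for threshold and backbone are straightforward to evaluate. The sketch below implements the principal branch of the Lambert-W function by Newton iteration, so that no external library is assumed; the function names, the starting point and the tolerance are illustrative choices.

```python
from math import e, exp, log


def lambert_w(x, tol=1e-15, max_iter=100):
    """Principal branch of W for x >= 0, i.e. the solution w of w*exp(w) = x."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    w = log(1.0 + x)  # starting point to the right of the root
    for _ in range(max_iter):
        ew = exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def x_c(c):
    """Replica-symmetric covering threshold, first expression of eq. (30)."""
    w = lambert_w(2.0 * c)
    return 1.0 - (2.0 * w + w * w) / (4.0 * c)


def b_c(c):
    """Replica-symmetric backbone size at the threshold, second expression of eq. (30)."""
    w = lambert_w(2.0 * c)
    return 1.0 - w * w / (2.0 * c)


for c in (0.1, 0.5, 1.0, e / 2.0):
    print(c, x_c(c), b_c(c))
```

At 2c = e one has W(e) = 1, so the threshold becomes 1 - 3/(2e) and the backbone 1 - 1/e, which makes c = e/2 a convenient test point.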

This result is completely consistent with the bounds of section |i] which is particularly inter- 
esting for very small c where these bounds are very close, see table 1. The result for x c (c) 
is displayed in Fig. 5 along with numerical data, which were obtained by the variant of the
branch-and-bound algorithm which always looks for a cover of minimum cardinality. For
each edge concentration c considered and for system sizes N = 12, 17, 25, 35, 50, 70, 100,
140, the threshold was calculated for 10000 different realizations of the random graphs (only
1000 for N > 100). The average value is denoted by $x_c(c,N)$. Then, for each value
of c, the behavior of the infinite graph was extrapolated by fitting the function
$x_c(N) = x_c + aN^{-b}$ to the data, where $x_c$, a and b are adjustable parameters. The inset shows
an example of such an extrapolation. The resulting $x_c$ as a function of c agrees very
well with the analytic result. This is true not only for small concentrations but
also for a region beyond the percolation threshold, whereas systematic deviations appear for
larger c.
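The extrapolation step can be illustrated as follows. This is only a sketch of one possible fitting procedure (a scan over the exponent b combined with linear least squares for the other two parameters); the paper does not state which fitting routine was used, and the data below are synthetic values generated from assumed parameters, not the numerical results of the paper.

```python
def fit_power_law(sizes, values, b_grid):
    """Fit values ~ x_inf + a * N**(-b) by scanning b over b_grid.

    For each fixed b the model is linear in (x_inf, a), so these two
    parameters follow from ordinary least squares.
    """
    best = None
    for b in b_grid:
        u = [n ** (-b) for n in sizes]
        k = len(u)
        su, sv = sum(u), sum(values)
        suu = sum(t * t for t in u)
        suv = sum(t * v for t, v in zip(u, values))
        det = k * suu - su * su
        a = (k * suv - su * sv) / det
        x_inf = (sv - a * su) / k
        sse = sum((x_inf + a * t - v) ** 2 for t, v in zip(u, values))
        if best is None or sse < best[0]:
            best = (sse, x_inf, a, b)
    return best[1], best[2], best[3]


# System sizes as in the text; the values are SYNTHETIC (assumed parameters).
sizes = [12, 17, 25, 35, 50, 70, 100, 140]
synthetic = [0.39 + 0.2 * n ** (-0.7) for n in sizes]
b_grid = [0.01 * k for k in range(10, 201)]
x_inf, a, b = fit_power_law(sizes, synthetic, b_grid)
print(x_inf, a, b)
```

With real data the statistical errors of the individual averages should enter as weights, which this sketch omits for brevity.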

Are there more complicated solutions to (27) which coincide with the numerics also for
larger c? We first remark that this equation is closed under

$$P^{(l)}(z) = \sum_{m=-l}^{\infty} a_m^{(l)}\; \delta(z + m\mu/l) \qquad (31)$$

for every positive integer l. The equations for $a_{-l}, \dots, a_{-1}$ close; all other weights with
non-negative indices follow. A simple analysis of these equations shows that for c < e/2 they
have no non-trivial solution with only non-negative weights; up to this point, (28) gives the
only valid solution. For c > e/2, non-trivial solutions with an arbitrary number of peaks
appear.

Together with the above-mentioned agreement of bounds and numerical data for low
vertex degrees, this leads to the following conjecture: For random graphs with c < e/2,
the exact values for the covering threshold and the backbone at this threshold are given by
equation (30). For c > e/2, the above value for $x_c(c)$ still gives a lower bound.

The last statement follows from the fact that in the replica approach the saddle point 
with the largest free energy has to be taken. Imagine now two different values for x c would 
be predicted by two different saddle points. In between these thresholds, one solution already 
predicts a positive energy and hence a larger free energy than the other. This saddle point 
has to be preferred, and it corresponds to the larger threshold. 

The transition at c = e/2 is not yet understood, as the multi-peak solutions (31) also
do not coincide with the numerical data. This can be seen in particular from the behavior of
the backbone size, which is largely overestimated analytically, see Fig. 6. In particular, the
minimum of $b_c(c)$ at c = e/2 cannot be found in the numerical data. The numerical results
were obtained by enumerating all possible covers at the threshold for the same
range of concentrations and sizes mentioned above. The same extrapolation technique
was also applied to obtain the values for the infinite random graph.

For the discrepancy of the numerical backbone size with the analytical data for c > e/2, 
there are two possible explanations: 
• In the analytics we count every spin as backbone which has magnetization tending
to ±1 in the thermodynamic limit, whereas in the numerics we count only vertices
which have magnetization equal to ±1 even for finite size. This difference can be rather
drastic: e.g. for the fully connected graph of N vertices one needs N - 1 vertices for a VC,
the average magnetization is therefore 1 - 2/N. The analytics would count a backbone
of size one, whereas the strict backbone vanishes.

• Above c = e/2 (or even above c = 1/2) replica symmetry breaking could appear. 
This would correspond to a clustering of the VCs in configurations space, cf. Q for a 
discussion of this phenomenon for SAT. As was seen there, the backbone size sensitively 
depends on this question. This point is still under investigation. 

Let us go back to 0 < c < e/2, where (28) was conjectured to be exact, and let us
extract more information about the minimal VCs from our solution. Due to the simple
geometrical nature of the underlying graphs, the VC problem allows a much more intuitive 
way of understanding results, in contrast for example to SAT. A first example was already 
given in section ^ where we gave simple examples for backbone and non-backbone structures. 
Let us now investigate the influence of the close environment of a vertex on its behavior, more 
precisely the influence of the vertex degree. The total distribution of (almost all) degrees is
given by the Poisson law (3), but we can distinguish three distinct contributions:

• The joint probability P(d, m = -1) that a vertex has degree d and magnetization
m = -1, i.e. this vertex belongs to the backbone and is uncovered in all minimal VCs.

• P(d, m = +1) gives the probability that a vertex has degree d and is covered in all
minimal VCs.

• The remaining vertices are not in the backbone and are thus described by
P(d, -1 < m < +1).

These quantities can be easily computed from P(z): according to the interpretation of the
self-consistent equation (27), we can calculate the effective-field distribution for a vertex of
degree d which, on average, has typical neighbors:

$$P_d(z + \mu) = P_-^{*d}(z) \qquad (32)$$

where $P_-(z)$ is exactly the quantity given in (27). Plugging our solution (28) into this
equation, we find

$$\begin{aligned}
P(d, m=-1) &= P_d(z<0)\,\mathrm{Po}_{2c}(d) = e^{-2c}\,\frac{[2c - W(2c)]^d}{d!} \\
P(d, -1<m<+1) &= P_d(z=0)\,\mathrm{Po}_{2c}(d) = e^{-2c}\,\frac{W(2c)\,[2c - W(2c)]^{d-1}}{(d-1)!} \qquad (33) \\
P(d, m=+1) &= P_d(z>0)\,\mathrm{Po}_{2c}(d) = e^{-2c}\,\frac{(2c)^d - [2c + (d-1)W(2c)]\,[2c - W(2c)]^{d-1}}{d!}
\end{aligned}$$

The results for c = 1 are displayed in Fig. 7 along with numerical data for N = 17, 35, 70.
Please note that the numerical results seem to converge towards the analytical ones, thus
showing an excellent agreement of both approaches. The curves are easily understood: a
vertex with degree 0 has no neighbors. Therefore, it does not appear in any optimum cover
and we obtain P(0, m = -1) = 1, P(0, m > -1) = 0. With increasing degree the probability
that a vertex is covered increases, thus the contribution of P(d, -1 < m < +1) to $\mathrm{Po}_{2c}(d)$
increases as well. For large degrees it is very probable that a vertex belongs to all VCs but
even a finite fraction of vertices with m = -1 remains.
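The three degree-resolved contributions can be evaluated directly. The sketch below assumes the picture used in the derivation: each neighbor of a vertex independently carries a negative field with probability a = W(2c)/(2c), and the vertex is always uncovered, free, or always covered if it has zero, one, or at least two such neighbors. The helper names are hypothetical; the Lambert-W routine is a plain Newton iteration.

```python
from math import exp, factorial, log


def lambert_w(x, tol=1e-15, max_iter=100):
    """Principal branch of W for x >= 0 (Newton iteration)."""
    w = log(1.0 + x)
    for _ in range(max_iter):
        ew = exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def degree_contributions(d, c):
    """Joint probabilities (m = -1, -1 < m < +1, m = +1) for degree d."""
    a = lambert_w(2.0 * c) / (2.0 * c)
    poisson = exp(-2.0 * c) * (2.0 * c) ** d / factorial(d)
    p_minus = poisson * (1.0 - a) ** d            # no neighbor with negative field
    p_free = poisson * d * a * (1.0 - a) ** (d - 1) if d > 0 else 0.0  # exactly one
    p_plus = poisson - p_minus - p_free           # two or more
    return p_minus, p_free, p_plus


def magnetization_bounds(d, c):
    """Bounds on m(d), taking free spins at m = -1 (lower) or m = +1 (upper)."""
    a = lambert_w(2.0 * c) / (2.0 * c)
    lower = 1.0 - 2.0 * (1.0 + (d - 1) * a) * (1.0 - a) ** (d - 1)
    upper = 1.0 - 2.0 * (1.0 - a) ** d
    return lower, upper


for d in range(6):
    print(d, degree_contributions(d, 1.0), magnetization_bounds(d, 1.0))
```

Summing the second contribution over all degrees gives W(2c)^2/(2c), i.e. one minus the backbone size at the threshold, which provides a simple cross-check.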

This behavior can also be studied by evaluating the average magnetization m(d) as a
function of the degree. Here the analytical solution gives only lower and upper bounds since
we are not able to precisely calculate the magnetization of the non-backbone spins:

$$1 - 2\left[ 1 + (d-1)\frac{W(2c)}{2c} \right] \left( 1 - \frac{W(2c)}{2c} \right)^{d-1} < m(d) < 1 - 2\left( 1 - \frac{W(2c)}{2c} \right)^{d} \qquad (34)$$

Results are displayed in Fig. 8: with increasing size N of the graphs the numerical data
approach the region inside the bounds. The magnetization turns out to be a monotonically
increasing function of the vertex degree, as expected from the results for P(d,m). These
results justify a posteriori the application of the heuristic within the algorithm: vertices
having a large degree are included into the cover set first.



5.3.2 Approximating the VC entropy 

It is also interesting to go away from the threshold into the coverable phase, $x > x_c(c)$, and
to ask for the number of VCs, which is given by the cover entropy (19). As the saddle point
equations for P(h) are too hard to be solved directly, we have used a simple variational ansatz:
we plug a set of simple test functions into the free energy and optimize
with respect to these, cf. |l| for an application in SAT. The simplest ansatz is provided by
taking a Gaussian distribution,

$$P^{(var)}(h) = \frac{1}{\sqrt{2\pi\Delta}\,T}\, \exp\left\{ -\frac{(h - T z_0)^2}{2\Delta T^2} \right\} \qquad (35)$$

which includes only two free parameters, $\Delta$ and $z_0$. Note that the resulting fields h already have the
linear scaling with temperature T which is needed for the limit $T \to 0$ in the coverable phase.
Using the rescaled variable $z = (h - T z_0)/(T\sqrt{\Delta})$, we get the following variational expression for the
VC entropy:



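$$\begin{aligned}
s_{VC}(x,c) ={}& \int Dz\; \frac{3 - z\,(z + 2z_0/\sqrt{\Delta})}{2}\; \log[2\cosh(\sqrt{\Delta}\,z + z_0)] \\
& + c \int Dz_1 \int Dz_2\; \log\left[ 1 - \frac{\exp\{-\sqrt{\Delta}\,(z_1+z_2) - 2z_0\}}{4\cosh(\sqrt{\Delta}\,z_1 + z_0)\,\cosh(\sqrt{\Delta}\,z_2 + z_0)} \right] \qquad (36)
\end{aligned}$$

Dz denotes the normal Gaussian measure $dz\, e^{-z^2/2}/\sqrt{2\pi}$. This expression has to be optimized
with respect to the parameters $\Delta$ and $z_0$, which fulfill the additional constraint

$$\int_{-\infty}^{\infty} Dz\, \tanh(\sqrt{\Delta}\,z + z_0) = 2x - 1 . \qquad (37)$$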

Fig. 9 compares the resulting entropy with numerical enumerations of all VCs for graphs
with c = 1.0 as a function of x. Because of the large numerical effort, only graphs with
N < 50 were considered. Deep inside the coverable region, the variational entropy appears to be
a very good approximation, as the numerical values approach it with increasing graph sizes
N. Near the threshold the Gaussian ansatz (35) starts to fail, as it includes only one scale for
the fields and thus is not able to reflect the partial freezing into backbone and non-backbone
spins. Comparable results were also obtained for other values of c.
6 Conclusions and outlook



In this paper the vertex-cover problem on random graphs with a finite average vertex degree 
was studied. The problem was investigated using several methods. Numerical calculations 
with an exact branch-and-bound algorithm were performed. The coverability of a graph 
shows a sharp transition in the cardinality xN of vertex covers at the threshold $x_c(c)$. There
are almost surely no VCs with $x < x_c(c)$, whereas they exist almost surely for $x > x_c(c)$.
This transition is related to a jump in the median complexity of the algorithm, and in the
size of the backbone as well.

A cluster expansion for non-percolated graphs gives very precise estimates of threshold 
and backbone for small c. Two approaches coming from the statistical physics of disordered 
systems were applied to the VC problem. The annealed approximation reproduces a known 
graph-theoretical lower bound. A more sophisticated method is given by the replica ansatz,
which allows one to derive analytical expressions for the threshold $x_c(c)$ and the backbone $b_c(c)$
for average vertex degrees less than Euler's number e, where also the agreement with
numerical data is excellent. These expressions are conjectured to be exact. Beyond the
average connectivity 2c = e, the replica symmetric ansatz fails to produce valid results,
and more complicated methods including replica symmetry breaking should be applied in
the future.

We have also given a variational approximation for the vertex cover entropy, i.e. the
logarithm of the number of VCs of given cardinality. Whereas this approximation was rather
precise far above the covering threshold, the threshold region can be described only by going beyond a
simple Gaussian approximation. The behavior for $x \approx x_c(c)$ deserves further investigation.

It would also be interesting to consider different graph ensembles, e.g. graphs of constant
vertex degree or graphs having locally non-tree-like structures.
A Calculation of the annealed bound

In this appendix we calculate the annealed bound for the covering threshold. As stated in
(20), it follows from the average of the partition function over the random graph ensemble.
Here we use the second formulation, see section 2.2, where edges are drawn with probability 2c/N:



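$$\begin{aligned}
\overline{Z(T,x|\{J_{i,j}\})}
&= \overline{\sum_{C_x(\{S_i\})} \exp\{-H(\{S_i\},\{J_{i,j}\})/T\}} \\
&= \sum_{C_x(\{S_i\})} \prod_{1\le i<j\le N} \overline{\exp\{-J_{i,j}\,\delta_{S_i,-1}\,\delta_{S_j,-1}/T\}} \\
&= \sum_{C_x(\{S_i\})} \prod_{1\le i<j\le N} \left[ 1 + \frac{2c}{N}\left( e^{-\delta_{S_i,-1}\delta_{S_j,-1}/T} - 1 \right) \right] \\
&= \binom{N}{xN} \exp\{cN(1-x)^2(e^{-T^{-1}} - 1) + o(N)\} \\
&= \exp\{N[-x\log x - (1-x)\log(1-x) + c(1-x)^2(e^{-T^{-1}} - 1)] + o(N)\}
\end{aligned}$$

where the last expression was obtained using Stirling's formula. This gives the annealed
entropy from section 5.2 in the limit $T \to 0$.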
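The quality of the large-N asymptotics can be checked numerically. The sketch below evaluates the averaged partition function exactly for finite N in the zero-temperature limit, assuming that each of the N(N-1)/2 vertex pairs carries an edge independently with probability 2c/N, so that a configuration contributes only if no pair of uncovered vertices is connected. The function names are hypothetical.

```python
from math import lgamma, log, log1p


def s_ann(x, c):
    """Asymptotic annealed entropy for T -> 0, as obtained above."""
    return -x * log(x) - (1.0 - x) * log(1.0 - x) - c * (1.0 - x) ** 2


def log_binomial(n, k):
    """Logarithm of the binomial coefficient n over k."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def annealed_entropy_finite(n, x, c):
    """(1/N) log of the averaged partition function at T -> 0 for finite N."""
    covered = round(x * n)
    uncovered = n - covered
    pairs = uncovered * (uncovered - 1) // 2
    # Every pair of uncovered vertices must be free of edges.
    return (log_binomial(n, covered) + pairs * log1p(-2.0 * c / n)) / n


for n in (10**2, 10**4, 10**6):
    print(n, annealed_entropy_finite(n, 0.4, 1.0), s_ann(0.4, 1.0))
```

The difference between the two columns shrinks with growing N, consistent with the o(N) corrections dropped in the exponent.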



B Calculation of the free energy 

The main problem in calculating the free-energy density consists in the average of the
logarithm of the partition function over the ensemble of random graphs. The replica trick is
based on the simple equality

$$\log Z = \lim_{n\to 0} \frac{Z^n - 1}{n} \qquad (38)$$

which is valid for positive real Z. It reduces the task to calculating the average of $Z^n$. In principle,
this problem is not easier than before. But the trick used in statistical physics is the
following: we first calculate the average of $Z^n$ for positive integer n, and try to obtain some analytical
continuation at the end. The n-fold power can be understood in terms of n identical copies
$\{S_i^a\}$, $a = 1,\dots,n$, of the original system. Each of these copies has the same Hamiltonian
(fHf), including identical edges $J_{i,j}$, and fulfills the same magnetization constraint (13). The
average over random graphs is calculated analogously to the last appendix, cf. section 5.1
for the notations,



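$$\begin{aligned}
Z_n(T,x,c) :={}& \overline{Z^n(T,x|\{J_{i,j}\})} \\
={}& \overline{\sum_{C_x(\{S_i^a\})} \exp\left\{ -T^{-1} \sum_{i<j} J_{i,j} \sum_{a=1}^{n} \delta_{S_i^a,-1}\,\delta_{S_j^a,-1} \right\}} \qquad (39) \\
={}& \sum_{C_x(\{S_i^a\})} \exp\left\{ -cN + \frac{2c}{N} \sum_{i<j} \exp\left\{ -T^{-1} \sum_{a=1}^{n} \delta_{S_i^a,-1}\,\delta_{S_j^a,-1} \right\} + o(N) \right\}
\end{aligned}$$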

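This can be simplified by introducing the $2^n$ order parameters which are enumerated by
$\vec\sigma \in \{+1,-1\}^n$:

$$c(\vec\sigma) = \frac{1}{N} \sum_{i=1}^{N} \prod_{a=1}^{n} \delta_{S_i^a,\sigma^a} \qquad (40)$$

$c(\vec\sigma)$ measures the fraction of vertices i having the replicated spin $(S_i^1,\dots,S_i^n) = \vec\sigma$. We find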



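$$\begin{aligned}
Z_n(T,x,c) ={}& \int \prod_{\vec\sigma} dc(\vec\sigma)\; \frac{N!}{\prod_{\vec\sigma} (c(\vec\sigma)N)!} \qquad (41) \\
& \times \exp\left\{ -cN + cN \sum_{\vec\sigma,\vec\tau} c(\vec\sigma)\,c(\vec\tau)\, \exp\left\{ -T^{-1} \sum_{a=1}^{n} \delta_{\sigma^a,-1}\,\delta_{\tau^a,-1} \right\} + o(N) \right\}
\end{aligned}$$

The integration is over all $c(\vec\sigma)$ which are normalized, $\sum_{\vec\sigma} c(\vec\sigma) = 1$, and fulfill the
magnetization constraint, $\sum_{\vec\sigma} c(\vec\sigma)\,\sigma^a = 2x - 1$ for all $a = 1,\dots,n$. Using Stirling's formula we finally
find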



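$$\begin{aligned}
Z_n(T,x,c) ={}& \int \prod_{\vec\sigma} dc(\vec\sigma)\, \exp\Big\{ N \Big[ -c - \sum_{\vec\sigma} c(\vec\sigma) \log c(\vec\sigma) \\
& \qquad + c \sum_{\vec\sigma,\vec\tau} c(\vec\sigma)\,c(\vec\tau)\, \exp\Big\{ -T^{-1} \sum_{a=1}^{n} \delta_{\sigma^a,-1}\,\delta_{\tau^a,-1} \Big\} \Big] + o(N) \Big\} \\
=:{}& \int \prod_{\vec\sigma} dc(\vec\sigma)\, \exp\{ N g_n[c(\vec\sigma)] + o(N) \} \qquad (42)
\end{aligned}$$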

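The dominant term of O(N) in the exponent is given by the saddle point $c_0(\vec\sigma)$,

$$\log c_0(\vec\sigma) = \lambda_1 + \lambda_2 \sum_{a} \sigma^a + 2c \sum_{\vec\tau} c_0(\vec\tau)\, \exp\Big\{ -T^{-1} \sum_{a=1}^{n} \delta_{\sigma^a,-1}\,\delta_{\tau^a,-1} \Big\} \qquad (43)$$

where $\lambda_1$ is a Lagrange parameter for the normalization of $c(\vec\sigma)$, and $\lambda_2$ a second one for the
magnetization constraint.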

The problem which remains is the continuation to real n. We have to introduce some
ansatz on the structure of $c_0(\vec\sigma)$. The simplest one is based on the observation that $Z_n(T,x,c)$
is by definition invariant under permutations of the n replicas, which were introduced as being
identical. We therefore assume this symmetry also for the order parameter $c_0(\vec\sigma)$, which
consequently depends only on $s = \sum_a \sigma^a$. We may express it by a generating function,

$$c_0(\vec\sigma) = \int_{-\infty}^{\infty} dh\, P(h)\, \frac{e^{T^{-1}hs}}{(2\cosh T^{-1}h)^n} \qquad (44)$$

which is normalized whenever P(h) is normalized, $\int dh\, P(h) = 1$. The magnetization
condition now reads $\int dh\, P(h) \tanh T^{-1}h = 2x - 1$.

Plugging this replica symmetric ansatz into $g_n[c_0(\vec\sigma)]$, we get (24) by some
straightforward algebra from

$$f(T,x,c) = -T \lim_{n\to 0} \frac{1}{n}\, g_n[c_0(\vec\sigma)] . \qquad (45)$$

Also the saddle point equation (26) for P(h) can be easily calculated from (43).



C The saddle point equation at the threshold 



In order to calculate the saddle point equation at the threshold, we take the first procedure
proposed in section 5.3.1, i.e. we approach the threshold from above, using the scaling
$h = T H_{cov}\, z$ with some random variable z drawn from the distribution P(z). In the limit
$T \to 0$, (26) slightly simplifies ($\lambda = H_{cov}\,\mu$):

$$\int dz\, P(z)\, e^{H_{cov} z s} = \exp\left\{ -2c + H_{cov}\,\mu\, s + 2c \int dz\, P(z) \left[ \frac{1}{1 + e^{-2H_{cov} z}} \right]^{-s/2} \right\} \qquad (46)$$

If we approach the threshold, $H_{cov}$ diverges. In order to obtain a reasonable limit, we
have to keep $t = H_{cov}\, s$ finite in this limit:

$$\begin{aligned}
\int dz\, P(z)\, e^{zt} &= \exp\left\{ -2c + \mu t + 2c \int dz\, P(z) \lim_{H_{cov}\to\infty} \left[ \frac{1}{1 + e^{-2H_{cov} z}} \right]^{-t/(2H_{cov})} \right\} \\
&= \exp\left\{ -2c + \mu t + 2c \int_{0}^{\infty} dz\, P(z) + 2c \int_{-\infty}^{0} dz\, P(z)\, e^{-zt} \right\} \qquad (47)
\end{aligned}$$

Expanding the exponential of the last two terms in a power series, we find the desired equation.
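The limit used in the last step can be verified numerically. The sketch below evaluates the factor $[1/(1+e^{-2H_{cov}z})]^{-t/(2H_{cov})}$ for a large but finite $H_{cov}$ in a numerically stable way; the function name and the sample values of z and t are arbitrary illustrative choices.

```python
from math import exp, log1p


def kernel(z, t, h_cov):
    """[1 / (1 + exp(-2*h_cov*z))] ** (-t / (2*h_cov)), evaluated stably."""
    y = -2.0 * h_cov * z
    # softplus(y) = log(1 + exp(y)), written so that exp never overflows
    softplus = y + log1p(exp(-y)) if y > 0.0 else log1p(exp(y))
    return exp(t * softplus / (2.0 * h_cov))


t = 1.3
for h_cov in (1.0, 10.0, 1e3, 1e6):
    # z > 0: the factor tends to 1; z < 0: it tends to exp(-z*t)
    print(h_cov, kernel(0.5, t, h_cov), kernel(-0.5, t, h_cov), exp(0.5 * t))
```

Positive fields therefore contribute only to the constant term, while negative fields contribute the factor $e^{-zt}$, which is the origin of the two integrals in the final expression.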
Figure 1: Probability $P_{cov}(x)$ that a cover exists for a random realization (c = 1.0) as a
function of the fraction x of covered vertices. The result is shown for three different system
sizes N = 25, 50, 100 (averaged over $10^4$ to $10^3$ samples). Lines are guides to the eye only. In
the left part, where $P_{cov}$ is zero, the energy e (see text) is displayed. The inset enlarges
the result for the energy in the region 0.3 < x < 0.5.




Figure 2: Time complexity of the vertex cover: Median number of nodes visited in the
backtracking tree as a function of the fraction x of covered vertices for graph sizes N =
20, 25, 30, 35, 40 (c = 1.0). The inset shows the region below the threshold with logarithmic
scale, including also data for N = 45, 50. The fact that in this representation the lines are
equidistant demonstrates that the time complexity grows exponentially with N.
Figure 3: Median number of nodes visited in the backtracking tree as a function of the
fraction x of covered vertices, displayed separately for the cases of coverable and uncoverable 
graphs (N = 30, c = 1.0). Additionally a scatter plot of the number of nodes for 100 
realizations is presented: for each run a dot is included in the figure. 




Figure 4: The fractional size b(x) of the backbone as a function of the relative cardinality x 
of the vertex cover. The results are for the case c = 1.0 and for the system sizes N = 25, 35, 
50, 70, and 100. The inset shows results for N = 50. There the fractional backbone sizes are 
displayed either for the subset of graphs which are coverable with xN vertices (cov) or for 
uncoverable graphs (uncov). The total function b(x) is almost the minimum of both curves. 
Figure 5: Phase diagram: critical fraction $x_c$ of covered vertices as a function of the edge
density c. For $x > x_c$, almost all graphs have covers with xN vertices, while they have
almost surely no cover for $x < x_c$. The solid line shows the analytic result. The circles
represent the results of the numerical simulations. Error bars are much smaller than symbol
sizes. The upper bound of Harant is given by the dashed line, the bounds of Gazmuri by the
dash-dotted lines. The vertical line is at c = e/2. Inset: All numerical values were calculated
from finite-size scaling fits of $x_c(N,c)$ using functions $x_c(N) = x_c + aN^{-b}$. We show the data
for c = 1.0 as an example.




Figure 6: The backbone size b c at the critical point as a function of c. The solid line shows 
the analytic result. The numerical results are represented by the error bars. They were 
obtained from finite-size scaling fits similar to the calculation for x c (c). The vertical line is 
at c = e/2. 







Figure 7: Distribution of degrees d at the threshold (c = 1.0). We show the total distribution
of the degrees, determined by the ensemble of random graphs, as well as results describing
the minimal vertex covers. The total distribution is divided into three contributions arising
from the vertices which either are not in the backbone (magnetization -1 < m < 1) or which
are in the backbone and have magnetizations m = +1 or m = -1. Analytical predictions are
represented by the lines (which are guides to the eye only, connecting the results for integer
arguments), while the numerical results for N = 17, 35, 70 are displayed using the symbols.
Figure 8: The average magnetization of a vertex at the threshold as a function of its degree
d. The lower and upper bounds obtained from the analytical calculation in the limit $N \to \infty$
are shown by the lines. The symbols display the numerical results for N = 17, 35, 70.
Figure 9: Entropy of the configurations as a function of the relative cardinality x of the vertex
cover. The symbols represent results from the numerical enumerations, for different graph
sizes N. The solid line displays the result from the Gaussian variational approximation.